{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "https://www.bilibili.com/video/BV1u64y1i75a?p=2&vd_source=5cfd1b04987819146fefc856af2954b1"
   ]
  },
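  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two ways to keep the output of every layer as stable as possible:\n",
    "1. Weight initialization\n",
    "2. Choice and adjustment of the activation function\n",
    "\n",
    "Approach: the element-wise mean and variance of each layer's forward outputs and backward outputs (gradients) should stay as close to equal as possible from layer to layer"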
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1. How to initialize the weights sensibly"
   ]
  },
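  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Assumption: the input data follows a distribution with mean 0 and variance 1\n",
    "\n",
    "For the computation to stay numerically stable as data passes through the network, we require: **the output of every layer (forward and backward) should also have mean 0 and variance 1**"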
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.1 Xavier random initialization"
   ]
  },
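  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For a fully connected layer with $a$ inputs and $b$ outputs, each element of the layer's weights is sampled from the uniform distribution\n",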
    "\n",
    "$$\n",
    "    U(-\\sqrt{\\frac{6}{a+b}}, \\sqrt{\\frac{6}{a+b}})\n",
    "$$"
   ]
  },
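  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A brief sketch of where the bound $\\sqrt{6/(a+b)}$ comes from, under the usual simplifying assumptions (these are assumptions, not stated in the notes): the weights are i.i.d. with mean 0 and variance $\\gamma$, they are independent of the layer input, and the activation function is ignored. The symbols $\\gamma$ and $r$ are introduced here only for illustration.\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\text{forward variance preserved:}\\quad & a\\,\\gamma = 1 \\\\\n",
    "\\text{backward variance preserved:}\\quad & b\\,\\gamma = 1 \\\\\n",
    "\\text{compromise when } a \\neq b\\text{:}\\quad & \\gamma = \\frac{2}{a+b} \\\\\n",
    "\\text{variance of } U(-r, r)\\text{:}\\quad & \\frac{r^2}{3} = \\frac{2}{a+b} \\;\\Longrightarrow\\; r = \\sqrt{\\frac{6}{a+b}}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "For example, with $a = 256$ and $b = 128$ the bound is $r = \\sqrt{6/384} = 0.125$."
   ]
  },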
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2. Choosing and adjusting the activation function"
   ]
  },
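  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The derivation shows that the closer an activation function is to $y=x$, the better the numerical stability\n",
    "\n",
    "Plot several commonly used activation functions and take their Taylor expansions\n",
    "\n",
    "![image-20240720163323393](https://zyc-learning-1309954661.cos.ap-nanjing.myqcloud.com/machine-learning-pic/image-20240720163323393.png)\n",
    "\n",
    "Near 0, tanh and relu turn out to be very close to $y=x$, while sigmoid needs some adjustment"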
   ]
  }
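  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Taylor expansions around 0 make the comparison concrete. The notes do not say which adjustment of sigmoid is meant; the last line shows one common rescaling, $4\\,\\mathrm{sigmoid}(x) - 2$, as an assumption. Note that relu matches $y=x$ only for $x \\ge 0$.\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\mathrm{sigmoid}(x) &= \\frac{1}{2} + \\frac{x}{4} - \\frac{x^3}{48} + O(x^5) \\\\\n",
    "\\tanh(x) &= x - \\frac{x^3}{3} + O(x^5) \\\\\n",
    "\\mathrm{relu}(x) &= x \\quad \\text{for } x \\ge 0 \\\\\n",
    "4\\,\\mathrm{sigmoid}(x) - 2 &= x - \\frac{x^3}{12} + O(x^5)\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "Unscaled, sigmoid has value $1/2$ and slope $1/4$ at 0, so it is far from $y=x$ there; after the rescaling its leading term is $x$, like tanh."
   ]
  }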
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
